Implicit Pronunciation Modelling in ASR
Abstract
Modelling pronunciation variability is an important part of the acoustic model of a speech recognition system. Good pronunciation models contribute to the robustness and portability of a speech recogniser. Pronunciation modelling is usually associated with the recognition lexicon, which allows direct control of HMM selection. However, in state-of-the-art systems the use of clustering techniques has considerable cross-effects on dictionary design. Most large vocabulary speech recognition systems use a dictionary with multiple possible pronunciation variants per word. This paper describes a method for consistently reducing the number of pronunciation variants to one pronunciation per word. With the resulting single-pronunciation dictionaries, similar or better word error rates are achieved on both Wall Street Journal and Switchboard data.
Similar resources
A study of implicit and explicit modeling of coarticulation and pronunciation variation
In this paper, we focus on the modeling of coarticulation and pronunciation variation in Automatic Speech Recognition systems (ASR). Most ASR systems explicitly describe these production phenomena through context-dependent phoneme models and multiple pronunciation lexicons. Here, we explore the potential benefit of using feature spaces covering longer time segments in terms of implicit modeling...
Pronunciation Modelling of Foreign Words for Sepedi ASR
This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. In order to mode...
Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles
One of the challenges in automatic speech recognition is how to handle pronunciation variation. The main causes for pronunciation variation are the speaker (voice characteristics, accent, non-nativeness etc.) and the speaking style (reading, spontaneous responses, conversation etc.). An ASR system has basically two options for modelling the variation on the word and sub-word level: lexical mode...
Using Auxiliary Sources of Knowledge for Automatic Speech Recognition
Standard hidden Markov model (HMM) based automatic speech recognition (ASR) systems usually use cepstral features as acoustic observations and phonemes as subword units. The speech signal exhibits a wide range of variability, for example due to environmental variation and speaker variation. This leads to different kinds of mismatch, such as mismatch between acoustic features and acoustic models or mismatch ...
Pronunciation Modeling for Large Vocabulary Speech Recognition by Arthur
The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy for automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the pronunciations of words. Others model the pronunciati...